Video Captioning via Hierarchical Reinforcement Learning
Video captioning is the task of automatically generating a textual
description of the actions in a video. Although previous work (e.g.
sequence-to-sequence model) has shown promising results in abstracting a coarse
description of a short video, it is still very challenging to caption a video
containing multiple fine-grained actions with a detailed description. This
paper aims to address the challenge by proposing a novel hierarchical
reinforcement learning framework for video captioning, where a high-level
Manager module learns to design sub-goals and a low-level Worker module
recognizes the primitive actions to fulfill the sub-goal. With this
compositional framework to reinforce video captioning at different levels, our
approach significantly outperforms all the baseline methods on a newly
introduced large-scale dataset for fine-grained video captioning. Furthermore,
our non-ensemble model has already achieved the state-of-the-art results on the
widely-used MSR-VTT dataset.
Comment: CVPR 2018, with supplementary material.
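The Manager/Worker decomposition described above can be caricatured as follows. This is a deterministic toy: the Manager picks a sub-goal for each caption segment and the Worker emits the word actions that fulfil it. The sub-goals, vocabulary, and policies are all illustrative inventions, and the reinforcement learning itself (rewards, policy-gradient updates) is omitted entirely.

```python
# Toy sketch of a hierarchical captioner: a high-level Manager designs
# sub-goals, a low-level Worker produces primitive word actions for each.
SUBGOALS = ["pick_up", "pour", "stir"]          # latent segment intents (made up)
WORDS = {
    "pick_up": ["a", "person", "picks", "up", "a", "cup"],
    "pour": ["then", "pours", "water"],
    "stir": ["and", "stirs", "the", "drink"],
}

def manager(segment_idx):
    """High-level policy: choose the sub-goal for the next caption segment."""
    return SUBGOALS[segment_idx % len(SUBGOALS)]

def worker(subgoal):
    """Low-level policy: emit the primitive word actions fulfilling the sub-goal."""
    return WORDS[subgoal]

def generate_caption(num_segments=3):
    tokens = []
    for i in range(num_segments):
        tokens.extend(worker(manager(i)))
    return " ".join(tokens)

print(generate_caption())
```

In the actual framework both modules are learned policies trained with reinforcement signals at their respective levels; the point of the sketch is only the division of labour between goal-setting and word generation.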
Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform
Analyzing the ocean acoustic environment is a tricky task. Background noise
and variable channel transmission environment make it complicated to implement
accurate ship-radiated noise recognition. Existing recognition systems are weak
in addressing the variable underwater environment, thus leading to
disappointing performance in practical application. In order to keep the
recognition system robust in various underwater environments, this work
proposes an adaptive generalized recognition system - AGNet (Adaptive
Generalized Network). By converting fixed wavelet parameters into fine-grained
learnable parameters, AGNet learns the characteristics of underwater sound at
different frequencies. Its flexible and fine-grained design is conducive to
capturing more background acoustic information (e.g., background noise,
underwater transmission channel). To utilize the implicit information in
wavelet spectrograms, AGNet adopts the convolutional neural network with
parallel convolution attention modules as the classifier. Experiments reveal
that our AGNet outperforms all baseline methods on several underwater acoustic
datasets, and AGNet could benefit more from transfer learning. Moreover, AGNet
shows robust performance against various interference factors.
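The core idea above, converting fixed wavelet parameters into fine-grained learnable ones, can be sketched in NumPy. Each filter's centre frequency and bandwidth below are free parameters that a training loop could update; the Morlet parameterization, sampling rate, and filter length are my assumptions for illustration, not the paper's exact design.

```python
# Sketch of a fine-grained learnable wavelet front end: one complex Morlet
# filter per row, each with its own (trainable) centre frequency and bandwidth.
import numpy as np

def morlet_filter(center_hz, bandwidth_hz, sr=16000, length=401):
    """Complex Morlet filter; centre frequency and bandwidth are parameters."""
    t = (np.arange(length) - length // 2) / sr
    sigma = 1.0 / (2 * np.pi * bandwidth_hz)       # time-domain spread
    return np.exp(-0.5 * (t / sigma) ** 2) * np.exp(2j * np.pi * center_hz * t)

def wavelet_spectrogram(signal, centers, bandwidths, sr=16000):
    """Stack of per-filter magnitude responses (one row per learnable filter)."""
    rows = []
    for f0, bw in zip(centers, bandwidths):
        h = morlet_filter(f0, bw, sr)
        rows.append(np.abs(np.convolve(signal, h, mode="same")))
    return np.stack(rows)

sr = 16000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 1000 * t)                 # 1 kHz test tone
centers = np.array([500.0, 1000.0, 2000.0])        # learnable in training
bandwidths = np.array([100.0, 100.0, 100.0])       # learnable in training
S = wavelet_spectrogram(sig, centers, bandwidths, sr)
best = int(np.argmax(S.mean(axis=1)))              # filter tuned nearest 1 kHz wins
print(S.shape, best)
```

The resulting spectrogram would then feed the convolutional classifier with parallel convolution attention modules; because the filter parameters are differentiable, gradients from the classifier can adapt them to the background acoustics of a given underwater environment.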
Regional surname affinity: a spatial network approach
OBJECTIVE
We investigate surname affinities among areas of modern‐day China by constructing a spatial network and performing community detection. This yields a geographical genealogy of the Chinese population that is the result of population origins, historical migrations, and societal evolution.
MATERIALS AND METHODS
We acquire data from the census records supplied by China's National Citizen Identity Information System, including the surname and regional information of 1.28 billion registered Chinese citizens. We propose a multilayer minimum spanning tree (MMST) to construct a spatial network based on the matrix of isonymic distances, which is often used to characterize the dissimilarity of surname structure among areas. We use the fast unfolding algorithm to detect network communities.
RESULTS
We obtain a 10‐layer MMST network of 362 prefecture nodes and 3,610 edges derived from the matrix of the Euclidean distances among these areas. These prefectures are divided into eight groups in the spatial network via community detection. We measure the partition by comparing the inter‐distances and intra‐distances of the communities and obtain meaningful regional ethnicity classification.
DISCUSSION
The visualization of the resulting communities on the map indicates that the prefectures in the same community are usually geographically adjacent. The formation of this partition is influenced by geographical factors, historic migrations, trade and economic factors, as well as isolation of culture and language. The MMST algorithm proves to be effective in geo‐genealogy and ethnicity classification for it retains essential information about surname affinity and highlights the geographical consanguinity of the population.
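The multilayer minimum spanning tree construction can be sketched in pure Python: layer k is the MST of the distance graph after removing the edges already chosen in layers 1 through k-1, and the union of all layers forms the spatial network (which explains the edge count above: 10 layers × 361 edges = 3,610). The 5-node distance matrix below is invented for demonstration; the real input is the matrix of isonymic distances among 362 prefectures.

```python
# Multilayer minimum spanning tree (MMST): repeatedly extract an MST while
# excluding edges used by earlier layers, then take the union of all layers.
def kruskal_mst(n, edges, used):
    """Kruskal's MST over `edges` = [(w, i, j), ...], skipping `used` edges."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    tree = []
    for w, i, j in sorted(edges):
        if (i, j) in used:
            continue
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

def mmst(dist, layers):
    n = len(dist)
    edges = [(dist[i][j], i, j) for i in range(n) for j in range(i + 1, n)]
    used, network = set(), []
    for _ in range(layers):
        layer = kruskal_mst(n, edges, used)
        used.update(layer)
        network.extend(layer)
    return network

# Toy symmetric distance matrix over 5 "prefectures" (illustrative values).
dist = [[0, 1, 4, 6, 7],
        [1, 0, 2, 5, 8],
        [4, 2, 0, 3, 9],
        [6, 5, 3, 0, 2],
        [7, 8, 9, 2, 0]]
net = mmst(dist, layers=2)
print(len(net))   # 2 layers x (n - 1) edges each
```

Community detection (the fast unfolding, i.e. Louvain, algorithm) is then run on the resulting union network rather than on the dense distance matrix, which is what makes the sparse MMST backbone useful.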
CAPIA: Cloud Assisted Privacy-Preserving Image Annotation
Using public cloud for image storage has become a prevalent trend with the rapidly increasing number of pictures generated by various devices. For example, most of today's smartphones and tablets synchronize photo albums with cloud storage platforms. However, as many images contain sensitive information, such as personal identities and financial data, it is concerning to upload images to cloud storage. To eliminate such privacy concerns in cloud storage while keeping decent data management and search features, a spectrum of keyword-based searchable encryption (SE) schemes have been proposed in the past decade. Unfortunately, a fundamental gap remains open for their support of images, i.e., appropriate keywords need to be extracted from images before SE schemes can be applied to them. On one hand, it is obviously impractical for smartphone users to manually annotate their images. On the other hand, although cloud storage services now offer image annotation services, they rely on access to users' unencrypted images. To fill this gap and open the first path from SE schemes to images, this paper proposes a cloud assisted privacy-preserving automatic image annotation scheme, namely CAPIA. CAPIA enables cloud storage users to automatically assign keywords to their images by leveraging the power of cloud computing. Meanwhile, CAPIA prevents the cloud from learning the content of images and their keywords. Thorough analysis is carried out to demonstrate the security of CAPIA. A prototype implementation over the well-known IAPR TC-12 dataset further validates the efficiency and accuracy of CAPIA.
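A plaintext analogue of the annotation step helps fix ideas: keywords are assigned to an image by nearest-neighbour search over labelled reference feature vectors. In CAPIA the equivalent comparison is carried out under encryption so the cloud never sees the features or the keywords; the cryptographic layer is omitted here, and the feature vectors, keywords, and distance metric are all illustrative assumptions, not the scheme's actual construction.

```python
# Plaintext sketch of annotation-by-nearest-neighbour. The real scheme runs
# this comparison over encrypted data; no crypto is shown here.
import math

# Illustrative reference set: (feature vector, keywords) pairs.
DATABASE = [([0.9, 0.1, 0.0], ["beach", "sand"]),
            ([0.1, 0.8, 0.1], ["forest", "trees"]),
            ([0.0, 0.2, 0.9], ["city", "night"])]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def annotate(features, k=1):
    """Return the keywords of the k reference images closest to `features`."""
    ranked = sorted(DATABASE, key=lambda item: euclidean(features, item[0]))
    keywords = []
    for _, kws in ranked[:k]:
        keywords.extend(kws)
    return keywords

print(annotate([0.85, 0.15, 0.05]))   # nearest the beach exemplar
```

The security challenge CAPIA addresses is precisely that this distance ranking must be computable by the cloud without it learning either the query features or the matched keywords.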